
A Systems-Based Approach to Metadata Management

Author: Andy Johnston

Automation is essential to metadata management for media organizations. With so much data arriving from so many sources, commercially sensible targets cannot be met by hand. As studios, broadcasters and streaming services capitalize on descriptive metadata to power the monetization and distribution of content, they still need transparency: how their metadata is organized and enhanced to meet their needs, and what happens during any automated process.

Automation underpins a cost-effective platform by ensuring accuracy and timeliness, the cornerstones of a trusted platform tasked with delivering consistent, high-quality output. At MetaBroadcast, we look at each function as a system and apply rules to automate the processes within that system. These functions or systems include data ingest, data cleansing, equivalence, genre classification, hierarchy healing and other capabilities important to consolidating and managing metadata. Each system is a microservice with a clearly defined interface, a clear intent and its own set of transparent rules.

Billions of rows of data are created every day. Systems-based automation helps video service providers establish the rules for what happens to that data. Rules define what data needs to be ingested, how that data will be mapped and cleansed, and how it is interpreted in context with other data.

Each set of rules is defined by the functional requirements of its microservice. For example, data cleansing rules create automated processes that eliminate files with specific attributes, reducing the number of files that progress to the next step in a defined workflow. Similarly, when unifying data from multiple sources, the data feeding specific fields can be prioritized based on preferred sources.
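As a minimal sketch of the two kinds of rule described above, the Python below filters out records with unwanted attributes and then fills each field from the most preferred source that has a value. The attribute names, source names and rankings are hypothetical and chosen only for illustration; they are not the rules used in our services.

```python
# Hypothetical cleansing rule: drop records whose attribute value is blocklisted.
BLOCKED_VALUES = {"status": {"test", "placeholder"}}

# Hypothetical source ranking per field, most preferred source first.
FIELD_PRIORITY = {
    "title": ["studio_feed", "epg_feed"],
    "synopsis": ["epg_feed", "studio_feed"],
}


def passes_cleansing(record: dict) -> bool:
    """Return True if no attribute of the record holds a blocklisted value."""
    return not any(
        record.get(field) in blocked for field, blocked in BLOCKED_VALUES.items()
    )


def unify(records_by_source: dict) -> dict:
    """Fill each field from the highest-ranked source with a non-empty value."""
    unified = {}
    for field, sources in FIELD_PRIORITY.items():
        for source in sources:
            value = records_by_source.get(source, {}).get(field)
            if value:
                unified[field] = value
                break
    return unified
```

Keeping the rules in plain data structures, separate from the code that applies them, is one way to keep them transparent: the rule set can be reviewed without reading the processing logic.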

Rules-based automation improves accuracy, efficiency, equivalence and information security:

Accuracy:

  • Automation helps identify errors and maintain the accuracy of metadata by updating it in real time, as changes occur in the underlying data.
  • For example, if a data field such as release date does not contain a four-digit number reflecting an existing calendar year, it should be flagged for review and correction. Automated processes identify inaccurate data fields in near real time.
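The release date check could look something like the sketch below. It treats the field as a year only and flags anything that is not exactly four digits or falls outside a plausible range. The function name, the lower bound of 1888 and the allowance of five years into the future are assumptions made for this example, not values taken from our platform.

```python
import re
from datetime import date
from typing import Optional

# Assumption: four ASCII digits is the only accepted shape for a year.
FOUR_DIGITS = re.compile(r"[0-9]{4}")


def release_year_needs_review(
    value: object,
    min_year: int = 1888,            # assumed lower bound for a plausible year
    max_year: Optional[int] = None,  # defaults to current year + 5 (assumed)
) -> bool:
    """Return True if the value should be flagged for review and correction."""
    if max_year is None:
        max_year = date.today().year + 5
    text = "" if value is None else str(value).strip()
    if not FOUR_DIGITS.fullmatch(text):
        return True
    return not (min_year <= int(text) <= max_year)
```

A check like this only flags the field; the correction itself would still go through whatever review step the workflow defines.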

Efficiency:

  • Manually consolidating, cleansing and managing metadata can be a time-consuming process, particularly with data ingested from multiple sources and spread across multiple platforms.
  • Automated metadata management tools can perform these tasks in a fraction of the time, freeing up resources to focus on other tasks.

Equivalence:

  • At its most basic, equivalence is identifying, matching and linking data from different sources.
  • Our automated equivalence process uses a series of rules to identify and score concepts and relationships between subject and candidate content records that may be considered for matching.
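To make the idea of rule-based scoring concrete, here is a deliberately simplified sketch. Each rule compares a subject record with a candidate record and contributes points when it holds, and candidates that reach a threshold are kept as possible matches. The two rules, their weights and the threshold are hypothetical; a production equivalence process would use many more signals than this.

```python
def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def same_title(subject: dict, candidate: dict) -> bool:
    a, b = subject.get("title"), candidate.get("title")
    return bool(a and b) and normalize(a) == normalize(b)


def same_year(subject: dict, candidate: dict) -> bool:
    a, b = subject.get("year"), candidate.get("year")
    return a is not None and a == b


# Hypothetical rules with integer weights, and an assumed match threshold.
RULES = [(same_title, 60), (same_year, 40)]
MATCH_THRESHOLD = 80


def score(subject: dict, candidate: dict) -> int:
    """Sum the weights of every rule that holds for this pair of records."""
    return sum(weight for rule, weight in RULES if rule(subject, candidate))


def best_matches(subject: dict, candidates: list) -> list:
    """Return candidates at or above the threshold, highest score first."""
    scored = [(score(subject, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for points, c in scored if points >= MATCH_THRESHOLD]
```

Because each rule is a named function with an explicit weight, it is possible to explain why two records were or were not linked, which is the kind of transparency discussed earlier.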

Information Security:

  • The security of cloud-based platforms is imperative when managing the creation of high-quality metadata and ensuring data pipelines flow as expected. 
  • Automating security processes reduces the possibility of human configuration errors and also frees up time to focus on tasks critical to managing metadata. 

A systems-based approach depends on each system’s consistent use of defined and diverse data types, holistic visibility across the integrated systems, overall scalability (often enabled by the cloud’s elasticity), ease of collaboration and integration with customer platforms, and the ability to link assets based on logical relationships.

This is the foundation for our approach to metadata management. The value of our system lies in its automation, transparency and repeatability. Atlas orchestrates data workflows and automates processes for normalizing, matching and unifying metadata based on an agreed data schema. Atlas also automates and repeats the review of data hierarchies to ensure compliance with the data schema and the availability of data related to TV brands, series and episodes, or film collections and franchises.
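One small part of a hierarchy review can be sketched as a check that every record points at a parent of the expected type. The record layout (`id`, `type`, `parent_id`) and the episode-to-series-to-brand rule below are assumptions made for illustration; they are not the Atlas schema.

```python
# Hypothetical schema rule: which parent type each record type must point to.
EXPECTED_PARENT = {"episode": "series", "series": "brand"}


def find_orphans(records: list) -> list:
    """Return ids of records whose parent is missing or of the wrong type."""
    by_id = {record["id"]: record for record in records}
    orphans = []
    for record in records:
        expected = EXPECTED_PARENT.get(record["type"])
        if expected is None:
            continue  # top-level types such as "brand" need no parent
        parent = by_id.get(record.get("parent_id"))
        if parent is None or parent["type"] != expected:
            orphans.append(record["id"])
    return orphans
```

Running a check like this repeatedly, after each ingest or update, is what turns a one-off audit into an automated review.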

Atlas was developed as a cloud-native application using Amazon Web Services and complying with AWS recommendations for security, scalability, reliability and, ultimately, cost-effectiveness for our clients. The result is visible in unified, high-quality metadata repositories that are complete, accurate, consistent and relevant to each customer – enabling internal users to more readily discover, review and utilize the data describing valuable content. 

We’ll be at IBC2023 from 15-18 September.